1 Call graph prefetching for database applications
Murali Annavaram, Jignesh M. Patel, Edward S. Davidson 

November 2003 ACM Transactions on Computer Systems (TOCS), Volume 2:

Full text available: pdf(701.71 KB) Additional Information: full citation, abstract, references

With the continuing technological trend of ever cheaper and larger memory, n 
soon be able to reside in main memory. In this configuration, the performance 
between the processing speed of the CPU and the memory access latency. Pre 
applications have large instruction and data footprints and hence do not use p 
paper, we propose Call Graph Prefetching (CGP), ... 



Keywords: Instruction cache prefetching, call graph, database 



2 Session 3: Energy-aware OS's: The benefits of event-driven energy accou
Frank Bellosa 

September 2000 Proceedings of the 9th workshop on ACM SIGOPS European works 

the operating system 
Full text available: pdf(86.80 KB) Additional Information: full citation, abstract, references

A prerequisite of energy-aware scheduling is precise knowledge of any activity 
Embedded hardware monitors (e.g., processor performance counters) have pr 
field of performance analysis. The same approach can be applied to investigat 
individual threads. We use information about active hardware units (e.g., inte 
interface) gathered by event counters to establish a t ... 

3 Energy efficient architectures: Reducing power requirements of instruction
allocation of multiple datapath resources 

Dmitry Ponomarev, Gurhan Kucuk, Kanad Ghose 

Full text available: pdf(1.51 MB) Publisher Site Additional Information: full citation,

The "one-size-fits-all" philosophy used for permanently allocating datapath re 
maximize performance across a wide range of applications results in the overc 
reduce power dissipation in the datapath, the resource allocations can be dyni 
of applications. We propose a mechanism to dynamically, simultaneously and 
issue queue (IQ), the reorder buffer (R ... 

Keywords: dynamic instruction scheduling, energy-efficient datapath, power r 

4 Using a user-level memory thread for correlation prefetching 
Yan Solihin, Jaejin Lee, Josep Torrellas 

May 2002 ACM SIGARCH Computer Architecture News, Volume 30 Issue 2 
Full text available: pdf(1.48 MB) Publisher Site Additional Information: full citation, abstract

This paper introduces the idea of using a User-Level Memory Thread (ULMT) f
approach, a user thread runs on a general-purpose processor in main memory 
in a DRAM chip. The thread performs correlation prefetching in software, send 
cache of the main processor. This approach requires minimal hardware beyonc 
table is a software data structure that reside ... 

Keywords: data prefetching, intelligent memory, processing-in-memory, comf
prefetching, memory hierarchies, caches, threads 



5 Memory-wall: Execution history guided instruction prefetching 

Yi Zhang, Steve Haga, Rajeev Barua 

June 2002 Proceedings of the 16th international conference on Supercomput 

Full text available: pdf(218.17 KB) Additional Information: full citation, abstract, references

The increasing gap in performance between processors and main memory has 
techniques more important than ever. A major deficiency of existing prefetchi 
an extra port to I-cache. A recent study by [19] shows that this factor alone e 
microprocessors do not use such hardware-based I-cache prefetch schemes. T 
First we present a method that does not require an ... 

Keywords: hardware, instruction cache, performance, prefetching 

6 Delayed Internet routing convergence
Craig Labovitz, Abha Ahuja, Abhijit Bose, Farnam Jahanian 
June 2001 IEEE/ACM Transactions on Networking (TON), Volume 9 Issue 3 
Full text available: pdf(220.26 KB) Additional Information: full citation, abstract, references,

This paper examines the latency in Internet path failure, failover, and repair c 
interdomain routing. Unlike circuit-switched paths which exhibit failover on th
experimental measurements show that interdomain routers in the packet-swit
to reach a consistent view of the network topology after a fault. These delays 
fluctuations formed during the operation of the Bo ... 

Keywords: Internet, failure analysis, network reliability, routing 



7 Dead-block prediction & dead-block correlating prefetchers 

An-Chow Lai, Cem Fide, Babak Falsafi 

May 2001 ACM SIGARCH Computer Architecture News, Proceedings of the 28th a
Computer architecture, Volume 29 Issue 2 

Full text available: pdf(972.60 KB) Publisher Site Additional Information: full citation, abstract

Effective data prefetching requires accurate mechanisms to pre
cache blocks to prefetch and “when” to prefetch t
Dead-Block Predictors (DBPs), trace-based predictors that accu
“when” an L1 data cache block becomes evictable
Predicting a dead block significantly enhances prefetching looka
enables placing data directly into L1, obviating the n ...

8 Delayed Internet routing convergence 

Craig Labovitz, Abha Ahuja, Abhijit Bose, Farnam Jahanian 
August 2000 ACM SIGCOMM Computer Communication Review, Proceedings of th
Technologies, Architectures, and Protocols for Computer Communica 

Full text available: pdf(313.83 KB) Additional Information: full citation, abstract, references,

This paper examines the latency in Internet path failure, failover and repair di 
inter-domain routing. Unlike switches in the public telephony network which e 
milliseconds, our experimental measurements show that inter-domain routers 
take tens of minutes to reach a consistent view of the network topology after 
temporary routing table oscillations formed during ... 

9 A flow-based approach to datagram security
Suvo Mittra, Thomas Y. C. Woo 

October 1997 ACM SIGCOMM Computer Communication Review, Proceedings of tl
Applications, technologies, architectures, and protocols for compute 

Full text available: pdf(2.04 MB) Additional Information: full citation, abstract, references, ci

Datagram services provide a simple, flexible, robust, and scalable communica 
been well demonstrated by the success of IP, UDP, and RPC. Yet, the overwhe 
protocols that have been proposed are geared towards connection-oriented co 
datagram communications tend to either rely on long term host-pair keying oi 
requiring connection setup) semantics. Separately, t ... 

10 Scheduling and page migration for multiprocessor compute servers 
Rohit Chandra, Scott Devine, Ben Verghese, Anoop Gupta, Mendel Rosenblum
November 1994 Proceedings of the sixth international conference on Architectural

operating systems, Volume 29, 28 Issue 11, 5
Full text available: pdf(1.56 MB) Additional Information: full citation, abstract, references, ci

Several cache-coherent shared-memory multiprocessors have been developed 
coupling between the processing resources. They are therefore quite attractive
multiprogramming and parallel application workloads. Process scheduling and 
challenging due to the distributed main memory found on such machines. This
scheduling and page migration policies on the perfo ... 

11 Techniques for compressing program address traces 
Andrew R. Pleszkun 

November 1994 Proceedings of the 27th annual international symposium on M 

Full text available: pdf(931.63 KB) Additional Information: full citation, abstract, references,

In this paper a technique for generating consistent, reproducible traces with a 
compression than standard general-purpose compression programs is describe
once, an intermediate form is generated and then read as the input to the sec 
program source code is required, and this technique will work on address stre. 
the way the address trace is encod ...

Keywords: compression, trace generation 